NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

Fast and Memory-Efficient Video Diffusion Using Streamlined Inference

https://doi.org/10.52202/079017-0437

Gong, Yifan; Kong, Zhenglun; Meng, Zichong; Niu, Wei; Wang, Yanzhi; Wu, Yushu; Yang, Changdi; Yuan, Geng; Zhan, Zheng; Zhao, Pu (January 2024, Neural Information Processing Systems Foundation, Inc. (NeurIPS))

Full Text Available
Search for Efficient Large Language Models

https://doi.org/10.52202/079017-4421

Gong, Yifan; Kong, Zhenglun; Lin, Ming; Lin, Xue; Shen, Xuan; Wang, Yanzhi; Wu, Chao; Wu, Yushu; Zhan, Zheng; Zhao, Pu (January 2024, Neural Information Processing Systems Foundation, Inc. (NeurIPS))

Full Text Available
Exploring Token Pruning in Vision State Space Models

https://doi.org/10.52202/079017-1613

Gong, Yifan; Ioannidis, Stratis; Kong, Zhenglun; Meng, Zichong; Niu, Wei; Shen, Xuan; Wang, Yanzhi; Wu, Yushu; Zhan, Zheng; Zhao, Pu; et al (January 2024, Neural Information Processing Systems Foundation, Inc. (NeurIPS))

Full Text Available
Pruning Parameterization with Bi-level Optimization for Efficient Semantic Segmentation on the Edge

https://doi.org/10.1109/CVPR52729.2023.01478

Yang, Changdi; Zhao, Pu; Li, Yanyu; Niu, Wei; Guang, Jiexiong; Tang, Hao; Qin, Minghai; Ren, Bin; Lin, Xue; Wang, Yanzhi (June 2023, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2023)

Full Text Available
Pruning Parameterization with Bi-level Optimization for Efficient Semantic Segmentation on the Edge

https://doi.org/10.1109/CVPR52729.2023.01478

Yang, Changdi; Zhao, Pu; Li, Yanyu; Niu, Wei; Guan, Jiexiong; Tang, Hao; Qin, Minghai; Ren, Bin; Lin, Xue; Wang, Yanzhi (June 2023, The Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR-23))

Full Text Available
Pruning Parameterization with Bi-level Optimization for Efficient Semantic Segmentation on the Edge

Yang, Changdi; Zhao, Pu; Li, Yanyu; Niu, Wei; Guan, Jiexiong; Tang, Hao; Qin, Minghai; Ren, Bin; Lin, Xue; Wang, Yanzhi (June 2023, The IEEE / CVF Computer Vision and Pattern Recognition Conference (CVPR))

With the ever-increasing popularity of edge devices, it is necessary to implement real-time segmentation on the edge for autonomous driving and many other applications. Vision Transformers (ViTs) have shown considerably stronger results for many vision tasks. However, ViTs with the fullattention mechanism usually consume a large number of computational resources, leading to difficulties for realtime inference on edge devices. In this paper, we aim to derive ViTs with fewer computations and fast inference speed to facilitate the dense prediction of semantic segmentation on edge devices. To achieve this, we propose a pruning parameterization method to formulate the pruning problem of semantic segmentation. Then we adopt a bi-level optimization method to solve this problem with the help of implicit gradients. Our experimental results demonstrate that we can achieve 38.9 mIoU on ADE20K val with a speed of 56.5 FPS on Samsung S21, which is the highest mIoU under the same computation constraint with real-time inference.
more » « less
Full Text Available
The Autonomous Vehicle Assistant (AVA): Emerging technology design supporting blind and visually impaired travelers in autonomous transportation

https://doi.org/10.1016/j.ijhcs.2023.103125

Fink, Paul D.S.; Doore, Stacy A.; Lin, Xue; Maring, Matthew; Zhao, Pu; Nygaard, Aubree; Beals, Grant; Corey, Richard R.; Perry, Raymond J.; Freund, Katherine; et al (November 2023, International Journal of Human-Computer Studies)

Full Text Available
Advancing Model Pruning via Bi-level Optimization

Zhang, Yihua; Yao, Yuguang; Ram, Parikshit; Zhao, Pu; Chen, Tianlong; Hong, Mingyi; Wang, Yanzhi; Liu, Sijia (December 2022, 36th Conference on Neural Information Processing Systems (NeurIPS 2022))
Pruning-as-Search: Efficient Neural Architecture Search via Channel Pruning and Structural Reparameterization

https://doi.org/10.24963/ijcai.2022/449

Li, Yanyu; Zhao, Pu; Yuan, Geng; Lin, Xue; Wang, Yanzhi; Chen, Xin (July 2022, Thirty-First International Joint Conference on Artificial Intelligence)

Neural architecture search (NAS) and network pruning are widely studied efficient AI techniques, but not yet perfect.NAS performs exhaustive candidate architecture search, incurring tremendous search cost.Though (structured) pruning can simply shrink model dimension, it remains unclear how to decide the per-layer sparsity automatically and optimally.In this work, we revisit the problem of layer-width optimization and propose Pruning-as-Search (PaS), an end-to-end channel pruning method to search out desired sub-network automatically and efficiently.Specifically, we add a depth-wise binary convolution to learn pruning policies directly through gradient descent.By combining the structural reparameterization and PaS, we successfully searched out a new family of VGG-like and lightweight networks, which enable the flexibility of arbitrary width with respect to each layer instead of each stage.Experimental results show that our proposed architecture outperforms prior arts by around 1.0% top-1 accuracy under similar inference speed on ImageNet-1000 classification task.Furthermore, we demonstrate the effectiveness of our width search on complex tasks including instance segmentation and image translation.Code and models are released.
more » « less
Full Text Available
Towards Real-Time Segmentation on the Edge

Li, Yanyu; Yang, Changdi; Zhao, Pu; Yuan, Geng; Niu, Wei; Guan, Jiexiong; Tang, Hao; Qin, Minghai; Ren, Bin; Lin, Xue; et al (February 2023, AAAI'23: The Thirty-Seventh AAAI Conference on Artificial Intelligence)

There have been many recent attempts to extend the successes of convolutional neural networks (CNNs) from 2-dimensional (2D) image classification to 3-dimensional (3D) video recognition by exploring 3D CNNs. Considering the emerging growth of mobile or Internet of Things (IoT) market, it is essential to investigate the deployment of 3D CNNs on edge devices. Previous works have implemented standard 3D CNNs (C3D) on hardware platforms, however, they have not exploited model compression for acceleration of inference. This work proposes a hardware-aware pruning approach that can fully adapt to the loop tiling technique of FPGA design and is applied onto a novel 3D network called R(2+1)D. Leveraging the powerful ADMM, the proposed pruning method achieves simultaneous high accuracy and significant acceleration of computation on FPGA. With layer-wise pruning rates up to 10× and negligible accuracy loss, the pruned model is implemented on a Xilinx ZCU102 FPGA board, where the pruned model achieves 2.6× speedup compared with the unpruned version, and 2.3× speedup and 2.3× power efficiency improvement compared with state-of-the-art FPGA implementation of C3D.
more » « less
Full Text Available

« Prev Next »

Search for: All records